An evaluation of keyword spotting performance utilizing false alarm rejection based on prosodic information
Abstract
In this paper, we describe our effort to develop a new method of false alarm rejection for keyword-spotting speech recognition systems. The method uses prosodic similarity and works as a posterior rescoring step. Keyword spotting always suffers from false alarms; here, we propose a technique to reject them using prosodic features. In Japanese, prosodic information is expressed through intonation, whereas many other languages use stress accents. It is therefore easy to compute prosodic information in Japanese from the fundamental frequency (F0). In our new keyword spotting engine, the result is obtained by combining two scores: a phonetic score calculated by the front engine, and a pitch score calculated by the post engine described in this paper. With this method, we achieved a 13 percentage point improvement in keyword recognition accuracy. We also propose a robust modeling method for rejection using prosodic features.
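The two-score combination described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration, not the paper's method: the pitch score is taken to be a Pearson correlation between the candidate's F0 contour and a reference contour for the keyword, the two scores are mixed linearly, and the names `f0_similarity`, `rescore`, `accept`, the `weight`, and the `threshold` are all hypothetical.

```python
import math


def f0_similarity(candidate_f0, template_f0):
    """Pitch score (assumed form): Pearson correlation of two equal-length
    F0 contours in Hz, mapped from [-1, 1] to [0, 1]."""
    n = len(candidate_f0)
    if n != len(template_f0) or n < 2:
        raise ValueError("contours must have the same length (>= 2 frames)")
    mean_c = sum(candidate_f0) / n
    mean_t = sum(template_f0) / n
    cov = sum((c - mean_c) * (t - mean_t)
              for c, t in zip(candidate_f0, template_f0))
    var_c = sum((c - mean_c) ** 2 for c in candidate_f0)
    var_t = sum((t - mean_t) ** 2 for t in template_f0)
    if var_c == 0 or var_t == 0:
        return 0.5  # flat contour: treat as neutral evidence
    r = cov / math.sqrt(var_c * var_t)
    return (r + 1.0) / 2.0


def rescore(phonetic_score, pitch_score, weight=0.5):
    """Linear mix of the front-engine and post-engine scores.
    The weight is a hypothetical tuning parameter."""
    return (1.0 - weight) * phonetic_score + weight * pitch_score


def accept(phonetic_score, candidate_f0, template_f0,
           weight=0.5, threshold=0.6):
    """Keep a keyword hypothesis only if the combined score clears the
    (hypothetical) threshold; otherwise reject it as a false alarm."""
    pitch_score = f0_similarity(candidate_f0, template_f0)
    return rescore(phonetic_score, pitch_score, weight) >= threshold


if __name__ == "__main__":
    template = [120, 140, 160, 150, 130]       # reference contour for the keyword
    same_shape = [110, 130, 150, 140, 120]     # same intonation, lower voice
    opposite_shape = [160, 140, 120, 130, 150]  # mirrored intonation
    print(accept(0.7, same_shape, template))
    print(accept(0.7, opposite_shape, template))
```

Correlation is used in this sketch because it ignores a constant offset in F0, so a speaker with a lower or higher voice is not penalized for the same intonation pattern. A real system would also need to time-align the contours and handle unvoiced frames, which the sketch leaves out.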
Related resources
Integration of phonetic and prosodic information for robust utterance verification - Vision, Image and Signal Processing, IEE Proceedings-
Mandarin speech is known for its tonal characteristic, and prosodic information plays an important role in Mandarin speech recognition. Driven by this property, phonetic and prosodic information are integrated and used for Mandarin telephone speech keyword spotting. A two-stage strategy, with recognition followed by verification, is adopted. For keyword recognition, 132 subsyllable models, two ...
Utterance verification using prosodic information for Mandarin telephone speech keyword spotting
In this paper, the prosodic information, a very special and important feature in Mandarin speech, is used for Mandarin telephone speech utterance verification. A two-stage strategy, with recognition followed by verification, is adopted. For keyword recognition, 59 context-independent subsyllables, i.e., 22 INITIAL’s and 37 FINAL’s in Mandarin speech, and one background/silence model, are used a...
Utterance Verification Using Prosodic Information for Mandarin Telephone Speech Keyword Spotting - Acoustics, Speech, and Signal Processing, 1999. Proceedings., 1999 IEEE International Conference o
In this paper, the prosodic information, a very special and important feature in Mandarin speech, is used for Mandarin telephone speech utterance verification. A two-stage strategy, with recognition followed by verification, is adopted. For keyword recognition, 59 context-independent subsyllables, i.e., 22 INITIAL’s and 37 FINAL’s in Mandarin speech, and one background/silence model, are used as the b...
Attention-based End-to-End Models for Small-Footprint Keyword Spotting
In this paper, we propose an attention-based end-to-end neural approach for small-footprint keyword spotting (KWS), which aims to simplify the pipelines of building a production-quality KWS system. Our model consists of an encoder and an attention mechanism. The encoder transforms the input signal into a high level representation using RNNs. Then the attention mechanism weights the encoder feat...
Recovery from false rejection using statistical partial pattern trees for sentence verification
In conversational speech recognition, recognizers are generally equipped with a keyword spotting capability to accommodate a variety of speaking styles. In addition, language model incorporation generally improves the recognition performance. In conversational speech keyword spotting, there are two types of errors, false alarm and false rejection. These two types of errors are not modeled in la...